The Cityscapes Dataset
نویسندگان
چکیده
Semantic understanding of urban street scenes through visual perception has been widely studied due to many possible practical applications. Key challenges arise from the high visual complexity of such scenes. In this paper, we present ongoing work on a new large-scale dataset for (1) assessing the performance of vision algorithms for different tasks of semantic urban scene understanding, including scene labeling, instance-level scene labeling, and object detection; (2) supporting research that aims to exploit large volumes of (weakly) annotated data, e.g. for training deep neural networks. We aim to provide a large and diverse set of stereo video sequences recorded in street scenes from 50 different cities, with high quality pixel-level annotations of 5000 frames in addition to a larger set of weakly annotated frames. The dataset is thus an order of magnitude larger than similar previous attempts. Several aspects are still up for discussion, and timely feedback from the community would be greatly appreciated. Details on annotated classes and examples will be available at www. cityscapes-dataset.net. Moreover, we will use this website to collect remarks and suggestions.
منابع مشابه
Multiple Skip Connections and Dilated Convolutions for Semantic Segmentation
We propose a scale-aware semantic segmentation method specifically for small objects. The contributions of this method are 1) to feed the features of a small region by using multiple skip connections, and 2) to extract context from multiple receptive fields by using multiple dilated convolution blocks. The proposed method has achieved high accuracy in the Cityscapes dataset. In comparison with ...
متن کاملSupplementary Material: Pixelwise Instance Segmentation with a Dynamically Instantiated Network
In this supplementary material, we include more detailed qualitative and quantitative results on the VOC and SBD datasets. Furthermore, we also show the runtime of our algorithm. Figures 1 and 2 show success and failure cases of our algorithm. Figure 3 compares the results of our algorithm to the publicly available model for MNC [3]. Figure 4 compares our results to those of FCIS [6], concurren...
متن کاملSemantic Foggy Scene Understanding with Synthetic Data
This work addresses the problem of semantic foggy scene understanding (SFSU). Although extensive research has been performed on image dehazing and on semantic scene understanding with weatherclear images, little attention has been paid to SFSU. Due to the difficulty of collecting and annotating foggy images, we choose to generate synthetic fog on real images that depict weather-clear outdoor sc...
متن کاملSemantic segmentation - using Convolutional Neural Networks and Sparse Dictionaries
The two main bottlenecks using deep neural networks are data dependency and training time. This thesis proposes a novel method for weight initialization of the convolutional layers in a convolutional neural network. This thesis introduces the usage of sparse dictionaries. A sparse dictionary optimized on domain specific data can be seen as a set of intelligent feature extracting filters. This t...
متن کاملEfficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++
Manually labeling datasets with object masks is extremely time consuming. In this work, we follow the idea of PolygonRNN [4] to produce polygonal annotations of objects interactively using humans-in-the-loop. We introduce several important improvements to the model: 1) we design a new CNN encoder architecture, 2) show how to effectively train the model with Reinforcement Learning, and 3) signif...
متن کامل